498 research outputs found

Opportunities and challenges in using AI Chatbots in Higher Education

Authors: Evans Chris; Yang Shanshan (Researcher in computer science)
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/11/2019

Artificial intelligence (AI) conversational chatbots have gained popularity over time, and have been widely used in the fields of e-commerce, online banking, and digital healthcare and well-being, among others. The technology has the potential to provide personalised service to a range of consumers. However, the use of chatbots within educational settings is still limited. In this paper, we present three chatbot prototypes that the Warwick Manufacturing Group, University of Warwick, is currently developing, and discuss the potential opportunities and technical challenges we face when considering AI chatbots to support our daily activities within the department. Three AI virtual agents are under development: 1) to support the delivery of a taught Master's course simulation game; 2) to support the training and use of a newly introduced educational application; 3) to improve the processing of helpdesk requests within a university department. We hope this paper is informative to those interested in using chatbots in the educational domain. We also aim to improve awareness among those within the chatbot development industry, in particular the chatbot engine providers, about the educational and operational needs within educational institutes, which may differ from those in other domains.

Warwick Research Archives Portal Repository

SOA services in higher education

Authors: Joy Mike; Yang Shanshan (Researcher in computer science)
Publication date: 01/11/2008

Service Oriented Architecture (SOA) is a recent architectural framework for distributed software system development in which software components are packaged as Services. It has become increasingly popular in academia and in industry, but has been principally used in the business domain. In higher education, however, SOA has rarely been applied or investigated. In this paper, we propose the idea of applying SOA technologies in the education domain, to increase both interoperability and flexibility within the e-learning environment. We expect that both students and teachers in higher educational institutions can benefit from this approach. We also describe a number of possible SOA services, along with a high-level service roadmap to support a university's learning and teaching activities.

Warwick Research Archives Portal Repository

Relative depth estimation from single monocular images with deep convolutional network

Author: Yang Alex (M.S. in computer science)
Publication venue: 'University of Missouri Libraries'

Field of study: Computer science. Dr. Grant Scott, Thesis Supervisor. "December 2017." Depth estimation from single monocular images is a theoretical challenge in computer vision as well as a computational challenge in practice. This thesis addresses the problem of depth estimation from single monocular images using a deep convolutional neural fields framework, which consists of convolutional feature extraction, superpixel dimensionality reduction, and depth inference. Data were collected using a stereo vision camera, which generated depth maps through triangulation that are paired with visual images. The visual image (input) and computed depth map (desired output) are used to train the model, which has achieved 83 percent test accuracy at the standard 25 percent tolerance. The problem has been formulated as depth regression for superpixels, and our technique is superior to existing state-of-the-art approaches based on its demonstrated generalization ability, high prediction accuracy, and real-time processing capability. We utilize the VGG-16 deep convolutional network as the feature extractor and conditional random fields for depth inference. We have leveraged a multi-phase training protocol that includes transfer learning and network fine-tuning, leading to high accuracy. Our framework has a robust modular nature, with the capability of replacing each component with different implementations for maximum extensibility. Additionally, our GPU-accelerated implementation of superpixel pooling has further facilitated this extensibility by allowing incorporation of feature tensors with flexible shapes and has provided both space and time optimization. Based on our novel contributions and high-performance computing methodologies, the model achieves a minimal and optimized design.
It is capable of operating at 30 fps, which is a critical step towards empowering real-world applications such as autonomous vehicles with passive relative depth perception using single camera vision-based obstacle avoidance, environment mapping, etc. Includes bibliographical references (pages 61-65).
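
The "accuracy at the standard 25 percent tolerance" quoted in this abstract can be illustrated with a short sketch. It assumes the figure refers to the ratio-threshold metric commonly used in depth estimation, where a prediction counts as correct when max(predicted/measured, measured/predicted) is below 1.25; the thesis may define its tolerance differently. The function name and sample values are hypothetical.

```python
def threshold_accuracy(predicted, measured, tolerance=0.25):
    """Fraction of depth predictions within a ratio tolerance of the reference.

    Assumption: a prediction is a hit when the larger of predicted/measured
    and measured/predicted is below 1 + tolerance. All depths must be positive.
    """
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for p, m in zip(predicted, measured):
        if p <= 0 or m <= 0:
            raise ValueError("depth values must be positive")
        ratio = max(p / m, m / p)
        if ratio < 1.0 + tolerance:
            hits += 1
    return hits / len(predicted)


# Hypothetical depths in metres: two of the four predictions fall within 25%.
predicted_depths = [1.0, 2.0, 3.9, 10.0]
measured_depths = [1.1, 2.0, 3.0, 7.0]
accuracy = threshold_accuracy(predicted_depths, measured_depths)
print(accuracy)  # 0.5
```

Under this reading, the reported 83 percent would mean that 83 percent of test predictions satisfy the ratio test.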

University of Missouri: MOspace

An effective services framework for sharing educational resources

Author: Yang Shanshan (Researcher in computer science)

Nowadays, the growing number of software tools to support e-learning and the data they rely upon are valuable resources, supporting different aspects of the complex learning and teaching processes, including designing learning content, delivering learning activities, and evaluating students’ learning performance. However, sharing these educational resources efficiently and effectively is a challenge: there are many resources, they have not been described accurately, in general they do not interoperate, and it is common for the tools to rely on different technologies. This thesis explores a solution – a novel educational services framework – to improve the sharing of current e-resources, by applying the latest service technologies in the context of higher education. Our findings suggest that the proposed framework is effective in dealing with the technical and educational issues in resource discovery, interoperability and reusability; however, technical challenges remain in implementing this service framework. This research is divided into three phases. The first phase investigates the sharing of e-learning resources through a literature survey, and identifies limitations of current developments. In the second phase, the current problems relating to resource sharing are addressed by a proposed educational service framework, which contains both educational and technical components. Through a case study, nine e-learning services and their dataflows are identified. To determine the technical components of the framework, a novel Educational Service Architecture is proposed, which allows resources to be better described, structured and connected, by following the principles of discoverability, interoperability and reusability in service technologies. In the third phase, part of the framework is implemented and evaluated by two studies.
In the first study, users’ experiences were collected via a simulation experiment, to compare the effectiveness of a service prototype with that of current technologies. In the second study, technical challenges for implementing the services framework were identified via a case study involving the implementation of another service prototype.

Warwick Research Archives Portal Repository

Efficient algorithms for scalable video coding

Author: Lu Xin (Researcher in Computer science)

A scalable video bitstream specifically designed for the needs of various client terminals, network conditions, and user demands is much desired in current and future video transmission and storage systems. The scalable extension of the H.264/AVC standard (SVC) has been developed to satisfy the new challenges posed by heterogeneous environments, as it permits a single video stream to be decoded fully or partially with variable quality, resolution, and frame rate in order to adapt to a specific application. This thesis presents novel improved algorithms for SVC, including: 1) a fast inter-frame and inter-layer coding mode selection algorithm based on motion activity; 2) a hierarchical fast mode selection algorithm; 3) a two-part Rate Distortion (RD) model targeting the properties of different prediction modes for the SVC rate control scheme; and 4) an optimised Mean Absolute Difference (MAD) prediction model. The proposed fast inter-frame and inter-layer mode selection algorithm is based on the empirical observation that a macroblock (MB) with slow movement is more likely to be best matched by one in the same resolution layer. However, for a macroblock with fast movement, motion estimation between layers is required. Simulation results show that the algorithm can reduce the encoding time by up to 40%, with negligible degradation in RD performance. The proposed hierarchical fast mode selection scheme comprises four levels and makes full use of inter-layer, temporal and spatial correlation as well as the texture information of each macroblock. Overall, the new technique demonstrates the same coding performance in terms of picture quality and compression ratio as that of the SVC standard, yet produces a saving in encoding time of up to 84%. Compared with state-of-the-art SVC fast mode selection algorithms, the proposed algorithm achieves a superior computational time reduction under very similar RD performance conditions.
The existing SVC rate distortion model cannot accurately represent the RD properties of the prediction modes, because it is influenced by the use of inter-layer prediction. A separate RD model for inter-layer prediction coding in the enhancement layer(s) is therefore introduced. Overall, the proposed algorithms improve the average PSNR by up to 0.34 dB or produce an average saving in bit rate of up to 7.78%. Furthermore, the control accuracy is maintained to within 0.07% on average. As a MAD prediction error always exists and cannot be avoided, an optimised MAD prediction model for the spatial enhancement layers is proposed that considers the MAD from previous temporal frames and previous spatial frames together, to achieve a more accurate MAD prediction. Simulation results indicate that the proposed MAD prediction model reduces the MAD prediction error by up to 79% compared with the JVT-W043 implementation.
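
For readers unfamiliar with the Mean Absolute Difference (MAD) that this abstract's prediction model estimates, the sketch below computes the MAD between two equally sized pixel blocks. It illustrates the general definition only and is not the thesis's prediction model; the function name and sample blocks are hypothetical.

```python
def mean_absolute_difference(block_a, block_b):
    """Mean absolute difference between two equally sized 2D pixel blocks."""
    flat_a = [value for row in block_a for value in row]
    flat_b = [value for row in block_b for value in row]
    if len(flat_a) != len(flat_b) or not flat_a:
        raise ValueError("blocks must be non-empty and the same size")
    total = sum(abs(a - b) for a, b in zip(flat_a, flat_b))
    return total / len(flat_a)


# Hypothetical 2x2 luma blocks: absolute differences are 1, 0, 1 and 4.
current_block = [[10, 12], [14, 16]]
reference_block = [[11, 12], [13, 20]]
print(mean_absolute_difference(current_block, reference_block))  # 1.5
```

In rate control the MAD of a frame or macroblock is not known before it is coded, which is why it has to be predicted from earlier frames or layers, as the abstract describes.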

Warwick Research Archives Portal Repository

Discovering gated recurrent neural network architectures

Author: Rawal Aditya (Ph. D. in computer science)
Publication date: 07/02/2019

Reinforcement Learning agent networks with memory are a key component in solving POMDP tasks. Gated recurrent networks such as those composed of Long Short-Term Memory (LSTM) nodes have recently been used to improve the state of the art in many supervised sequential processing tasks such as speech recognition and machine translation. However, scaling them to deep memory tasks in the reinforcement learning domain is challenging because of sparse and deceptive reward functions. To address this challenge, a new secondary optimization objective is first introduced that maximizes the information (Info-max) stored in the LSTM network. Results indicate that when combined with neuroevolution, Info-max can discover powerful LSTM-based memory solutions that outperform traditional RNNs. Next, for the supervised learning tasks, neuroevolution techniques are employed to design new LSTM architectures. Such architectural variations include discovering new pathways between the recurrent layers as well as designing new gated recurrent nodes. This dissertation proposes evolution of a tree-based encoding of the gated memory nodes, and shows that it makes it possible to explore new variations more effectively than other methods. The method discovers nodes with multiple recurrent paths and multiple memory cells, which lead to significant improvement in the standard language modeling benchmark task. The dissertation also shows how the search process can be sped up by training an LSTM network to estimate the performance of candidate structures, and by encouraging exploration of novel solutions. Thus, evolutionary design of complex neural network structures promises to improve performance of deep learning architectures beyond human ability to do so. Computer Science

Texas ScholarWorks

activePDF-Toolk

Author: Computer Science Engineering Department Washington University in St. Louis
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2006

This document provides information for deploying activePDF Toolkit Professional in a development environment. It is organized into four sections: Getting Started, Tutorials, Technical Reference and the Toolkit Appendices. The Getting Started section covers setup and installation, and includes a product overview and information related to operating Toolkit Professional. The Tutorials section includes examples of many Toolkit features, including PDF generation and form filling. All of the tutorials can be used with activePDF Toolkit. The Technical Reference section provides detailed information on Toolkit’s objects, subobjects, methods and properties.

Washington University St. Louis: Open Scholarship

Interactive Manipulation of 3D Scene Projections

Author: Washington University in St. Louis Department of Computer Science Engineering
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2005

Linear perspective is a good approximation to the format in which the human visual system conveys 3D scene information to the brain. Artists expressing 3D scenes, however, create nonlinear projections that balance their linear perspective view of a scene with elements of aesthetic style, layout and relative importance of scene objects. Manipulating the many parameters of a linear perspective camera to achieve a desired view is not easy. Controlling and combining multiple such cameras to specify a nonlinear projection is an even more cumbersome task. This paper presents a direct interface, where an artist manipulates in 2D the desired projection of a few features of the 3D scene. The features represent a rich set of constraints which define the overall projection of the 3D scene. Desirable properties of local linear perspective and global scene coherence drive a heuristic algorithm that attempts to interactively satisfy the sketched constraints as a weight-averaged projection of a minimal set of linear perspective cameras. This paper shows that 2D feature constraints are a direct and effective approach to control both the 2D layout of scene objects and the conceptually complex, high-dimensional parameter space of nonlinear scene projection. The simplicity of our interface also makes it an appealing alternative to standard through-the-lens and widget-based techniques to control a single linear perspective camera.

Washington University St. Louis: Open Scholarship

Building robust and modular question answering systems

Author: Chen Jifan (Ph. D. in Computer Science)
Publication date: 29/07/2023

Over the past few years, significant progress has been made in QA systems due to the availability of annotated datasets on a large scale and the impressive advancements in large-scale pre-trained language models. Despite these successes, the black-box nature of end-to-end trained QA systems makes them hard to interpret and control. When these systems encounter inputs that deviate from their training data distribution or are subjected to adversarial perturbations, their performance tends to deteriorate by a large margin. Furthermore, they may occasionally produce unanticipated results, potentially leading to confusion among users. Additionally, this deficiency in robustness and interpretability poses challenges when deploying such models in real-world scenarios. In this dissertation, we aim to build robust QA systems by explicitly decomposing various QA tasks into distinct sub-modules, each responsible for a particular aspect of the overall QA process. Through this decomposition, we seek to achieve improved performance in terms of both the system's ability to handle diverse and challenging inputs (robustness) and its capacity to provide transparent and explainable reasoning (interpretability). To address the aforementioned limitations, in this dissertation, we aim to build robust QA models by explicitly decomposing different QA tasks into different sub-modules. We argue that utilizing these sub-modules can substantially improve the robustness and interpretability of different QA systems. In the first half of this dissertation, we introduce three sub-modules to mitigate the dataset artifacts that models learn from datasets. These sub-modules also enable us to examine and exert explicit control over the intermediate outputs. In the first work, to address question answering that requires multi-hop reasoning, we propose a chain extractor, which extracts the reasoning chains necessary for models to derive the final answer. 
The reasoning chains not only prevent the model from exploiting reasoning shortcuts but also provide an explanation of how the answer is derived. In the second work, we incorporate an alignment layer between the question and the context before generating the answer. This alignment layer can help us interpret the models' behavior and improve robustness in adversarial settings. In the third work, we add an answer verifier after QA models generate the answer. This verifier can boost QA models' prediction confidence across several different domains and, by utilizing external NLI datasets and models, help us spot cases where QA models predict the right answer for the wrong reason. In the second half of this dissertation, we tackle the problem of complex fact-checking in the real world by treating it as a modularized QA task. We first decompose a complex claim into several yes-no subquestions whose answers directly contribute to the veracity of the claim. Then, each subquestion is fed into a commercial search engine to retrieve relevant documents. Additionally, we extract the relevant snippets in the retrieved documents and use a GPT3-based summarizer to generate the core evidence for checking the claim. We show that the decompositions can play an important role in both evidence retrieval and veracity composition of an explainable fact-checking system. We also show that the GPT3-based evidence summarizer generates faithful summaries of documents most of the time, indicating it can be used as an effective part of the pipeline. Moreover, we annotate a dataset, ClaimDecomp, containing 1,200 complex claims and their decompositions. We believe that this dataset can further promote building explainable fact-checking systems and analyzing complex claims in the real world. Computer Science

Texas ScholarWorks

From active to passive spatial acoustic sensing and applications

Author: Sun Wei (Ph. D. in computer science)
Publication date: 31/03/2023

The active acoustic sensing system emits modulated acoustic waves and analyzes reflection signals. It is dominant in acoustic spatial sensing. On the other hand, the passive acoustic sensing system receives and investigates natural sounds directly. It is good at semantic tasks but has weak performance on spatial sensing. In this dissertation, we bridge three gaps in existing systems: the gap between the assumptions of signal processing algorithms and the real acoustic environment, the gap between powerful active spatial sensing and limited passive spatial sensing, and the gap between semantic features and spatial information. We evolve the acoustic sensing system design and extend its functionality with three novel systems. First, we develop a fully active spatial sensing system, DeepRange, which can adapt to the real environment easily. We develop an effective mechanism to generate synthetic training data that captures noise, speaker/mic distortion, and interference in the signals. It removes the need to collect a large volume of data. We then design a deep range neural network (DRNet) to estimate the distance from raw acoustic signals. It is inspired by the signal processing insight that an ultra-long convolution kernel helps to combat noise and interference. The model is trained entirely on synthetic data, but it robustly achieves sub-centimeter error on real data despite various environments, background noise, interference, and mobile phone models. Second, we develop a fused active and passive spatial sensing system for speech separation, denoted Spatial Aware Multi-task learning-based Separation (SAMS). We leverage both active sensing and passive sensing to improve AoA estimation and jointly optimize the semantic task and the spatial task. SAMS estimates the spatial location and extracts speech for the target user during teleconferencing simultaneously.
We first generate fine-grained spatial embeddings from the user’s voice and inaudible tracking sound, which contain the user’s position and rich multipath information. Furthermore, we develop a deep neural network with multi-task learning to jointly optimize source separation and location. We significantly speed up inference to provide a real-time guarantee. Finally, we deeply fuse the semantic features and spatial cues to combat the interference and noise in the real environment as well as enable depth sensing in a fully passive setup. Inspired by the “flash-to-bang” phenomenon (i.e., hearing the thunder after seeing the lightning), we propose FBDepth to measure the depth of the sound source. We formulate the problem as an audio-visual event localization task for collision events. Specifically, FBDepth first aligns the correspondence between the video track and audio track to locate the target object and target sound at a coarse granularity. Based on the observation of moving objects’ trajectories, it estimates the intersection of optical flow before and after the collision to locate video events in time. It then feeds the estimated timestamp of the video event and the other modalities into the final depth estimation. We use a mobile phone to collect 3.6K+ video clips involving 24 different objects at up to 60m. FBDepth shows superior performance, especially at long range, compared to monocular and stereo methods. Computer Science
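
The "flash-to-bang" principle mentioned in this abstract reduces to simple arithmetic: light from an event arrives almost instantly, so the delay until its sound arrives is proportional to distance. The sketch below shows only that physical relation, not the FBDepth pipeline, which estimates the timestamps with learned audio-visual models. The constant of 343 m/s (dry air at about 20 °C) and all names are assumptions for illustration.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for dry air at about 20 degrees C


def flash_to_bang_depth(video_event_time_s, audio_event_time_s,
                        speed_of_sound=SPEED_OF_SOUND_M_PER_S):
    """Distance in metres to an event seen at one time and heard at a later time.

    Assumption: light travel time is negligible, so the whole delay is due
    to sound propagation.
    """
    delay_s = audio_event_time_s - video_event_time_s
    if delay_s < 0:
        raise ValueError("the sound cannot arrive before the visual event")
    return speed_of_sound * delay_s


# Hypothetical timestamps: a collision seen at 2.000 s and heard at 2.100 s.
depth_m = flash_to_bang_depth(2.000, 2.100)
print(round(depth_m, 1))  # 34.3
```

At the 60 m maximum range quoted in the abstract the delay is only about 0.175 s under this assumed speed, so small timestamp errors translate into large depth errors, which is presumably why precise event localization matters.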

Texas ScholarWorks